Abstract. In high-frequency financial data not only returns, but also waiting times
between consecutive trades are random variables. Therefore, it is possible to apply
[...] shows that the waiting-time survival probability for high-frequency data is
non-exponential. This fact imposes constraints on agent-based models of financial markets.

1. Introduction

Starting from the second half of the last decade, due to the availability of large financial
databases, there has been an increasing interest in the statistical properties of
high-frequency financial data and in market microstructural properties [...].
Various studies on high-frequency econometrics have appeared in the literature, among
them studies on autoregressive conditional duration models [...].

The basic remark that in high-frequency financial data not only returns but also
waiting times between consecutive trades are random variables [...] can already be
found in previous literature. For instance, it is present in a paper by Lo and MacKinlay
published in the Journal of Econometrics [12], but it can be traced at least to papers
on the application of compound Poisson processes [13] and subordinated stochastic
processes [...] to finance. Compound Poisson processes have been revisited in the recent
wave of interest in high-frequency data modelling [...].

Compound Poisson processes belong to the class of continuous-time random walks
(CTRWs) [...], which have recently been applied to finance as well (see Sec. 2 for
details). To our knowledge, the application of CTRWs to economics dates back, at least,
to the 1980s. In 1984, Rudolf Hilfer published a book on the application of stochastic
processes to operational planning, where CTRWs were used for sales forecasts [...]. The
(revisited) CTRW formalism has been applied to high-frequency price dynamics in
financial markets by our research group since 2000, in a series of three papers [20, 21, 22].
Other scholars have recently used this formalism [...]. However, CTRWs have
a famous precursor. In 1903, the PhD thesis of Filip Lundberg presented a model for
the ruin theory of insurance companies, which was further developed by Cramér [27].
The stochastic process underlying the Lundberg-Cramér model is another example
of a compound Poisson process and thus also of a CTRW.

Among other issues, we have studied the independence between log-returns and
waiting times for the 30 Dow Jones Industrial Average (DJIA) stocks traded at the New
York Stock Exchange in October 1999. For instance, according to a contingency-table
analysis performed on General Electric (GE) prices, the null hypothesis of independence
can be rejected at a significance level of 1% [...]. In this paper, however, the focus is
on the empirical distribution of waiting times [...].

This paper is divided as follows: Sec. 2 is devoted to a summary of CTRW 
theory as applied in finance; the relation of CTRWs to compound Poisson processes 
will be presented in some detail. In Sec. 3, following our empirical analysis, the reader 
can convince him/herself of the main result of this paper: for the 30 DJIA stocks in 
the period considered (October 1999), the waiting-time survival probability for high- 
frequency data is non-exponential. Finally, in Sec. 4, a possible explanation of this 
anomaly will be discussed using exponential mixtures as the analytical tool. 

2. Theory

The importance of random walks in finance has been known since the seminal thesis
of Bachelier [20], which was completed at the end of the XIXth century, more than a
hundred years ago. The ideas of Bachelier were further developed by many scholars [...].


The price dynamics in financial markets can be mapped onto a random walk whose
properties are studied in continuous, rather than discrete, time [...]. Here, we shall
present this mapping, pioneered by Bachelier, in a rather general way. It is worth
mentioning that this approach is related to that of Clark [...] and to the introductory
notes in Parkinson's paper [...]. As a further comment, this is a purely phenomenological
approach. No specific assumption on the rationality or the behaviour of market agents
is made or even necessary. In particular, it is not necessary to assume the validity of the
efficient market hypothesis [...]. Nonetheless, as shown below, a phenomenological
model can be useful in order to empirically corroborate or falsify the consequences
of behavioural or other assumptions on markets. Moreover, the model itself can be
corroborated or falsified by empirical data.

As a matter of fact, there are various ways in which a random walk can be embedded
in continuous time. Here, we shall base our approach on the so-called continuous-time
random walk, in which time intervals between successive steps are random variables, as
discussed by Montroll and Weiss [...].

Let $S(t)$ denote the price of an asset or the value of an index at time $t$. In a real
market, prices are fixed when buy orders are matched with sell orders and a transaction
(trade) occurs. Returns are more convenient to work with than prices. For this reason, we
shall take into account the variable $x(t) = \log S(t)$, that is the logarithm of the price.
Indeed, for a small price variation $\Delta S = S(t_{i+1}) - S(t_i)$, the return $r = \Delta S / S(t_i)$ and
the logarithmic return $r_{\log} = \log[S(t_{i+1})/S(t_i)]$ virtually coincide.
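
The near-coincidence of the two return definitions can be checked numerically. The following is a minimal sketch; the two prices are hypothetical values chosen only for illustration and do not come from the data set.

```python
import math

def simple_return(s_prev, s_next):
    """Return r = Delta S / S(t_i)."""
    return (s_next - s_prev) / s_prev

def log_return(s_prev, s_next):
    """Logarithmic return r_log = log[S(t_{i+1}) / S(t_i)]."""
    return math.log(s_next / s_prev)

# Hypothetical consecutive trade prices (assumption, not taken from the TAQ data).
s_prev, s_next = 100.00, 100.05

r = simple_return(s_prev, s_next)
r_log = log_return(s_prev, s_next)

# For small r the difference is of second order in r (about r**2 / 2).
print(r, r_log, abs(r - r_log))
```

For tick-by-tick price changes of this size, the discrepancy is several orders of magnitude smaller than the return itself, which is why the log-price is a convenient variable.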

As mentioned before, in financial markets not only prices can be modelled as
random variables, but also waiting times between two consecutive transactions vary in a
stochastic fashion. Therefore, the time series is characterised by $\varphi(\xi, \tau)$, the joint
probability density of log-returns $\xi_i = x(t_{i+1}) - x(t_i)$ and of waiting times $\tau_i = t_{i+1} - t_i$.
The joint density satisfies the normalization condition $\iint d\xi\, d\tau\, \varphi(\xi, \tau) = 1$. Both $\xi_i$ and
$\tau_i$ are assumed to be independent and identically distributed (i.i.d.) random variables.

Montroll and Weiss [...] have shown that the Fourier-Laplace transform of $p(x,t)$,
the probability density function (pdf) of finding the value $x$ of the price logarithm (which
is the diffusing quantity in our case) at time $t$, is:

[...]

and $\psi(\tau) = \int \varphi(\xi, \tau)\, d\xi$ is the waiting-time pdf.
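
For the reader's convenience, the relation usually quoted as the Montroll-Weiss equation is sketched below in its standard textbook form. The notation is an assumption made here: a tilde denotes the Laplace transform in time (variable $s$), a hat denotes the Fourier transform in space (variable $\kappa$), and the sign convention of the Fourier exponent is one common choice.

```latex
\tilde{\hat{p}}(\kappa, s) = \frac{1 - \tilde{\psi}(s)}{s}\,
                             \frac{1}{1 - \tilde{\hat{\varphi}}(\kappa, s)},
\qquad
\tilde{\hat{f}}(\kappa, s) = \int_{0}^{\infty} dt \int_{-\infty}^{+\infty} dx\;
                             e^{-st + i\kappa x}\, f(x, t).
```

The first factor is the Laplace transform of the survival function, and the second factor resums the contributions of walks with any number of jumps.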

The space-time version of eq. (1) can be derived by probabilistic considerations
[...]. The following integral equation gives the probability density, $p(x,t)$, for the walker
being in position $x$ at time $t$, conditioned on the fact that it was in position $x = 0$ at
time $t = 0$:

$$p(x,t) = \delta(x)\,\Psi(t) + \int_0^t \int_{-\infty}^{+\infty} \varphi(x - x', t - t')\, p(x', t')\, dt'\, dx', \qquad (3)$$

where $\Psi(t)$ is the so-called survival function, which is related to the marginal
waiting-time probability density $\psi(t)$. The survival function $\Psi(\tau)$ is:

$$\Psi(\tau) = 1 - \int_0^{\tau} \psi(\tau')\, d\tau' = \int_{\tau}^{\infty} \psi(\tau')\, d\tau'. \qquad (4)$$

The CTRW model can be useful in applications such as speculative option pricing by
Monte Carlo simulations or portfolio selection. This will be the subject of a forthcoming
paper. Here, it is more interesting to discuss the relation of this formalism to compound
Poisson processes. Indeed, compound Poisson processes are an instance of
continuous-time random walks in which waiting times and log-returns are independent random
variables; moreover, one assumes that the marginal waiting-time density $\psi(\tau)$ is an
exponential density:

$$\psi(\tau) = \mu\, e^{-\mu\tau}. \qquad (5)$$

Therefore, the probability $P(n,t)$ of getting $n$ log-price jumps up to time $t$ is given by
the Poisson distribution:

$$P(n,t) = \frac{(\mu t)^n}{n!}\, e^{-\mu t}, \qquad (6)$$

that is, the jump point process is a Poisson process. The log-price $x(t)$ at time $t$ is:

$$x(t) = \sum_{i=1}^{n(t)} \xi_i, \qquad (7)$$

where, as above, $n(t)$ is the number of jumps that occurred up to time $t$. Let $\lambda(\xi)$ denote
the marginal log-return density; then the solution of eq. (3) is:

$$p(x,t) = \sum_{n=0}^{\infty} \frac{(\mu t)^n}{n!}\, e^{-\mu t}\, \lambda_n(x), \qquad (8)$$

where $\lambda_n$ is the $n$-fold convolution of the density $\lambda$. Eq. (8) can also be derived by
purely probabilistic considerations. The interested reader can find more information on
a generalization of this case in a recent paper of our group [...]. An important property of
CTRWs is that log-returns and waiting times are independent and identically distributed
random variables. Still, there can be a dependence between the two random variables.
If they are independent, as in the case of compound Poisson processes, the joint pdf
$\varphi(\xi, \tau)$ is given by the product of the two marginal densities:

$$\varphi(\xi, \tau) = \lambda(\xi)\,\psi(\tau). \qquad (9)$$
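
The independent case with exponential waiting times, i.e. the compound Poisson process of eqs. (5)-(8), is easy to simulate. The sketch below is only illustrative: the Gaussian jump density, the parameter values and the function name `simulate_compound_poisson` are assumptions made here and are not taken from the paper.

```python
import random

def simulate_compound_poisson(mu, t_max, jump_sigma, rng):
    """Return (n, x): number of jumps up to t_max and the log-price change x(t_max)."""
    t = 0.0
    n = 0
    x = 0.0
    while True:
        # Waiting time drawn from the exponential density mu * exp(-mu * tau).
        t += rng.expovariate(mu)
        if t > t_max:
            return n, x
        n += 1
        # Jump drawn independently of the waiting time (Gaussian is an assumption).
        x += rng.gauss(0.0, jump_sigma)

rng = random.Random(12345)
mu, t_max, jump_sigma = 0.1, 100.0, 0.001   # hypothetical values
paths = [simulate_compound_poisson(mu, t_max, jump_sigma, rng) for _ in range(2000)]

mean_jumps = sum(n for n, _ in paths) / len(paths)
var_x = sum(x * x for _, x in paths) / len(paths)

# The Poisson distribution gives a mean number of jumps mu * t_max;
# for zero-mean jumps the variance of x(t) is mu * t_max * jump_sigma**2.
print(mean_jumps, mu * t_max)
print(var_x, mu * t_max * jump_sigma ** 2)
```

Replacing `rng.expovariate` with a non-exponential waiting-time generator turns the same loop into a simulation of a more general uncoupled CTRW.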



If they are not independent, then, according to the definition of conditional probability,
one has:

$$\varphi(\xi, \tau) = \lambda(\xi)\,\psi(\tau|\xi) = \lambda(\xi|\tau)\,\psi(\tau), \qquad (10)$$

where $\psi(\tau|\xi)$ and $\lambda(\xi|\tau)$ are conditional probability densities. Note, however, that
autoregressive conditional duration models introduce a dependence between waiting
times and this feature cannot be captured by the above formalism, as waiting times are
assumed to be i.i.d. random variables (see also ref. [...]).

3. Empirical evidence

3.1. The data set

The data set consists of nearly 800,000 prices $S(t_i)$ and times of execution $t_i$ obtained
from the TAQ database of the NYSE. These data were appropriately filtered in order to
remove misprints in prices and times of execution and correspond to the high-frequency
trades registered at NYSE in October 1999, for the 30 stocks of the Dow Jones Industrial
Average Index, namely, at that time: AA, ALD, AXP, BA, C, CAT, CHV, DD, DIS,
EK, GE, GM, GT, HWP, IBM, IP, JNJ, JPM, KO, MCD, MMM, MO, MRK, PG,
S, T, UK, UTX, WMT, XON. The choice of one month of high-frequency data was
a trade-off between the necessity of managing enough data for significant statistical
analyses and, on the other hand, the goal of minimizing the effect of external
economic fluctuations. The reader can determine the company to which the above
symbols correspond just by consulting the NYSE web pages (www.nyse.com).

In order to roughly highlight intraday patterns, the data set has been divided
into three daily periods: morning (from 9:00 to 10:59), midday (from 11:00 to 13:59)
and afternoon (from 14:00 to 17:00). In Table 1, the number of trades for each daily
period is given as a function of the stock.

3.2. Empirical analysis

In Fig. 1, the waiting-time complementary cumulative distribution function (or survival
function) $\Psi(\tau) = 1 - \int_0^{\tau} \psi(t')\, dt'$ is plotted for three different periods of the day and for
the GE time series of October 1999. In the above formula, $\psi(\tau)$ represents the marginal
waiting-time probability density function; $\Psi(\tau)$ gives the probability that the waiting
time between two consecutive trades is greater than the given $\tau$. The lines are the
corresponding standard exponential complementary cumulative distribution functions:

$$\Psi(\tau) = \exp(-\tau/\tau_0), \qquad (11)$$

where $\tau_0$ is the empirical average waiting time. Visual inspection already shows
the deviation of the real distribution from the exponential distribution. This fact is
corroborated by the Anderson-Darling test [...]. According to this test, for a large

Stock    n1 (9:00-10:59)    n2 (11:00-13:59)    n3 (14:00-17:00)
AA                  4098                5662                5298
ALD                 5248                7367                6504
AXP                 9054               12267               12988
BA                  5058                7080                6717
C                  15628               21578               18541
CAT                 3596                5361                4790
CHV                 4973                6608                5591
DD                  5284                7363                6913
DIS                 7160               10501                9182
EK                  3218                4433                4174
GE                 16063               20214               19372
GM                 16134                4340                6173
GT                  3124                4105                3968
HWP                10278               14095               12062
IBM                12534               22668               16633
IP                  4358                6263                5590
JNJ                 6693                9856                8644
JPM                 6410                7704                7991
KO                  8511               12437               10575
MCD                 5611                7729                6895
MMM                 3578                5398                4996
MO                  9680               14565               11852
MRK                 9222               13462               11587
PG                  6809                9598                8482
S                   4694                5838                5319
T                  12291               18598               14391
UK                  2738                3305                3208
UTX                 3745                5765                5249
WMT                 8344               12446               10256
XON                 9321               11669               10838

Table 1. For each daily period, the total number of corrected monthly trades is given
for each DJIA stock traded in October 1999.



number of samples, one has to compute the following statistic, after sorting the
samples $\tau_i$ in ascending order:

$$A^2 = [-m - S]\left(1 + \frac{0.6}{m}\right), \qquad (12)$$

where $m$ is the total number of samples and $S$ is

$$S = \sum_{i=1}^{m} \frac{2i-1}{m} \left\{ \ln[\Psi(\tau_{m-i+1})] + \ln[1 - \Psi(\tau_i)] \right\}, \qquad (13)$$

where $\Psi$ is the survival function. In order to test the exponential distribution, one must
insert in the above formula the survival function (11) with $\tau_0$ taken from the empirical
estimates in Table 2. In the case of GE (Fig. 1), the Anderson-Darling (AD) values
for the three daily periods are, respectively, 352, 285, and 446. Therefore, the null
hypothesis of exponential distribution can be rejected at the 1% significance level, as
the limit value is 1.957.
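
The test can be sketched in a few lines of code. The function below follows the standard form of the Anderson-Darling statistic for an exponential null hypothesis with the mean estimated from the sample; the function name, the synthetic samples and their parameters are assumptions introduced here for illustration, and the real TAQ waiting times are not used.

```python
import math
import random

def anderson_darling_exponential(waiting_times):
    """Modified AD statistic A^2 against Psi(tau) = exp(-tau / tau0).

    Assumes strictly positive waiting times (a zero would give log(0)).
    """
    taus = sorted(waiting_times)            # ascending order
    m = len(taus)
    tau0 = sum(taus) / m                    # empirical average waiting time
    s = 0.0
    for i in range(1, m + 1):
        # ln Psi(tau_{m-i+1}) for the exponential survival function
        log_psi = -taus[m - i] / tau0
        # ln[1 - Psi(tau_i)], computed with expm1 for numerical accuracy
        log_one_minus_psi = math.log(-math.expm1(-taus[i - 1] / tau0))
        s += (2 * i - 1) / m * (log_psi + log_one_minus_psi)
    return (-m - s) * (1.0 + 0.6 / m)

rng = random.Random(7)
# Synthetic exponential sample (null hypothesis true), mean 10 s.
exp_sample = [rng.expovariate(1.0 / 10.0) for _ in range(5000)]
# Synthetic two-component mixture (null hypothesis false), means 5 s and 50 s.
mix_sample = [rng.expovariate(1.0 / t0) for t0 in (5.0, 50.0) for _ in range(2500)]

a2_exp = anderson_darling_exponential(exp_sample)
a2_mix = anderson_darling_exponential(mix_sample)
print(a2_exp, a2_mix)   # compare each value with the 1% limit value 1.957
```

Real trade data are recorded with finite time resolution, so ties and zero waiting times may need to be handled before applying such a function.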

In Table 2, the values of the AD statistics are given for all the 30 DJIA stocks 
traded in October 1999. In all these cases the null hypothesis of exponentiality can be 
rejected at the 1 % significance level. 

It is interesting to observe that the average waiting time is systematically and
significantly larger at midday than in the morning or in the afternoon. This result
points to a variable NYSE trade activity and is in agreement with previously reported
behaviour in stock markets [...]. This fact has a biological explanation: around
midday the activity is slower as traders move from their desks to eat. In fact, as will
be seen, these intra-day variations in trading activity may also account for the reported
anomaly in the distribution of waiting times.

3.3. Independent results corroborating this study 

Our study demonstrates that the marginal density for waiting times is definitely not
an exponential function. After the publication of our paper series [20, 21, 22], different
waiting-time scales have been investigated in different markets by various authors. All
these empirical analyses corroborate the anomalous waiting-time behaviour. A study
on the waiting times in a contemporary FOREX exchange and in the XIXth century
Irish stock market was presented by Sabatelli et al. [...]. They were able to fit the
Irish data by means of a Mittag-Leffler function, as we did before in a paper on the
waiting-time marginal distribution in the German-bund future market [...]. Kyungsik
Kim and Seong-Min Yoon studied the tick dynamical behavior of the bond futures in
the Korean Futures Exchange (KOFEX) market and found that the survival probability
displays a stretched-exponential form [...]. Moreover, just to stress the relevance of
non-exponential waiting times, a power-law distribution has recently been detected by
T. Kaizoji and M. Kaizoji in analyzing the calm time interval of price changes in the
Japanese market [...].

4. Discussion and conclusions 

Why should we care about these empirical findings on the waiting-time distribution? 
This has to do both with the market price formation mechanisms and with the bid-ask 
process. A priori, one could argue that there is no strong reason for independent market 

           tau0 (s)   tau0 (s)   tau0 (s)     A^2        A^2        A^2
Stock     (morning)  (midday)  (afternoon)  (morning)  (midday)  (afternoon)
AA          27.1       40.0       28.8        29.2       66.0       44.8
ALD         21.2       30.8       23.4        21.8       55.5       33.8
AXP         11.8       18.5       11.7        81.7      102.5      130.7
BA          22.0       32.0       22.6        17.4       20.2       21.2
C            7.1       10.5        8.2       252.2      142.8      210.7
CAT         29.2       42.4       31.6        72.3      128.7       64.6
CHV         22.1       34.3       27.1       104.4      121.5       64.9
DD          20.3       30.8       22.1        22.9       44.3       36.1
DIS         15.2       20.8       16.6        53.4       53.4       74.7
EK          34.1       51.2       36.3        24.8       34.8       44.3
GE           7.0       11.3        7.9       351.9      284.7      445.6
GM          24.6       36.6       27.0        22.4       60.8       40.9
GT          34.3       55.5       37.9        73.7       95.7       54.1
HWP         10.4       16.1       12.7        94.8       77.8      100.8
IBM          8.9       10.0        9.2       409.6      472.5      489.5
IP          24.8       36.3       27.0        25.0       37.2       19.4
JNJ         16.1       23.0       17.7        30.4       35.6       38.0
JPM         17.0       29.5       19.0        33.0       85.2       85.8
KO          12.9       18.3       14.4        44.5       37.8       44.1
MCD         19.4       29.3       22.1        40.9       72.7       44.1
MMM         30.1       42.0       30.4        80.1       86.8       37.5
MO          11.4       15.6       12.9        74.2       89.0       75.2
MRK         11.7       16.8       13.2       133.1      136.0      189.8
PG          16.2       23.6       17.9        43.5       37.2       48.8
S           23.4       38.8       28.6        40.1       23.0       41.6
T            8.8       12.2       10.6       193.2      179.1      208.9
UK          40.4       69.1       46.7        33.8       72.4       47.2
UTX         28.5       39.3       29.0        33.7       62.9       58.0
WMT         12.5       18.2       14.9       105.2      110.6      139.1
XON         12.0       19.6       14.1       104.8      121.4      129.0

Table 2. For each daily period, the table gives the values of the empirical average
waiting time $\tau_0$ and the AD statistic $A^2$ [...].

[Figure 1. Plot title: General Electric Corporation (DJIA), October 1999. Horizontal axis: $\tau$ (s), ticks from 20 to 200.]

Figure 1. Waiting-time complementary cumulative distribution function $\Psi(\tau)$ for
GE trades quoted at NYSE in October 1999. Open diamonds represent $\Psi(\tau)$ for the
morning hours (9:00 - 10:59). There were 16063 trades in this period in October 1999.
The solid line is the corresponding standard exponential complementary cumulative
distribution function with $\tau_0 = 7.0$ s. Open circles represent $\Psi(\tau)$ for the period around
midday (11:00 - 13:59). There were 20214 trades in this period in October 1999.
The dashed line is the corresponding standard exponential complementary cumulative
distribution function with $\tau_0 = 11.3$ s. Open squares represent $\Psi(\tau)$ for the afternoon
hours (14:00 - 17:00). There were 19372 trades in this period in October 1999. The
dash-dotted line is the corresponding standard exponential complementary cumulative
distribution function with $\tau_0 = 7.9$ s. The day was divided into three periods to
highlight seasonalities (see text for explanation).

investors to place buy and sell orders in a time-correlated way. This argument would lead
one to expect a Poisson process. If price formation were a simple thinning of the bid-ask
process, then exponential waiting times should be expected between consecutive trades
as well [...]. Eventually, even if empirical analyses should show that time correlations
are already present at the bid-ask level, it would be interesting to understand why they
are there. In other words, the empirical results on the survival probability set limits
on statistical market models for price formation. A possibly related result has been
recently obtained by Fabrizio Lillo and Doyne Farmer, who find that the signs of orders
in the London Stock Exchange obey a long-memory process [...], as well as by
Jean-Philippe Bouchaud and coworkers [...]. Further studies on market microstructure will
be necessary to clarify this point.

However, it is possible to offer a simple explanation of the anomalous behaviour in
terms of exponential mixtures due to variable activity during the trading day.

Let us introduce a toy model of variable activity during a trading day. The trading
day can be divided into $N$ subintervals where waiting times follow an exponential
distribution with different average waiting times $\tau_{0,1}, \ldots, \tau_{0,N}$. Recalling that the
rate is the inverse of the average waiting time, $\mu_i = 1/\tau_{0,i}$, one has that the survival
function is given by:

$$\Psi(\tau) = \sum_{i=1}^{N} a_i\, e^{-\mu_i \tau}, \qquad (14)$$

where $a_i$ are suitable weights whose sum $\sum_{i=1}^{N} a_i$ must be 1, to fulfill the condition
$\Psi(0) = 1$. This sum of exponential components is itself non-exponential. For illustrative
purposes, in Fig. 2, the reader can find the comparison between eq. (14) and simulated
data in which the day had been divided into 10 intervals of equal weight. In each interval
the average waiting time between trades was constant and the waiting times followed
an exponential distribution. The value of the constant increased from 10 to 50 seconds in
the first five intervals and then decreased from 40 to 5 seconds in the last five intervals, so
that the sequence of average waiting times (in seconds: 10, 20, 30, 40, 50, 40, 30, 20, 10, 5)
is a rough representation of the activity in a real financial market. The open circles are the
survival function of the Monte Carlo simulation, the solid line represents the single exponential
fit of the survival function, whereas the crosses are values of the survival function
computed according to eq. (14) with $a_i = 1/10$. Even if, for long waiting times, the tail
of the distribution is again exponential, being governed by the smallest rate, $\mu = 1/50$ s$^{-1}$,
the exponential mixture can describe deviations from the single exponential law for short
and intermediate waiting times.
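
The toy model can be reproduced with a short Monte Carlo sketch. Two choices below are assumptions made for illustration: "equal weight" is read as the same number of waiting times drawn in each interval, and the single-exponential fit is obtained by matching the sample mean. The seed and sample sizes are arbitrary.

```python
import math
import random

# Average waiting times (s) of the ten equally weighted intervals quoted in the text.
TAU0 = [10, 20, 30, 40, 50, 40, 30, 20, 10, 5]
WEIGHTS = [1.0 / len(TAU0)] * len(TAU0)

def mixture_survival(tau):
    """Exponential mixture: sum_i a_i * exp(-mu_i * tau) with mu_i = 1 / tau0_i."""
    return sum(a * math.exp(-tau / t0) for a, t0 in zip(WEIGHTS, TAU0))

def simulate_waiting_times(n_per_interval, rng):
    """Draw the same number of exponential waiting times in each interval."""
    return [rng.expovariate(1.0 / t0) for t0 in TAU0 for _ in range(n_per_interval)]

def empirical_survival(samples, tau):
    """Fraction of samples strictly greater than tau."""
    return sum(1 for s in samples if s > tau) / len(samples)

rng = random.Random(2024)
samples = simulate_waiting_times(5000, rng)
mean_tau = sum(samples) / len(samples)   # parameter of the single-exponential fit

for tau in (10, 50, 100, 150):
    print(tau,
          empirical_survival(samples, tau),   # Monte Carlo estimate
          mixture_survival(tau),              # exponential mixture
          math.exp(-tau / mean_tau))          # single exponential
```

At large $\tau$ the mixture lies well above the single exponential with the same mean, which is the kind of excess of long waiting times that a single-rate model cannot reproduce.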

The probability density corresponding to eq. (14) can be formally written in the
following way:

$$\psi(\tau) = \sum_{i=1}^{N} a_i\, \mu_i\, e^{-\mu_i \tau}. \qquad (15)$$

Eq. (15) can be readily extended to a continuous spectrum of rates, $g(\mu)$:

$$\psi(\tau) = \int_0^{\infty} \mu\, e^{-\mu\tau}\, g(\mu)\, d\mu, \qquad (16)$$

where the condition $\int_0^{\infty} g(\mu)\, d\mu = 1$ must hold. Indeed, the integral equation (16) reduces
to eq. (15) if $g(\mu)$ has the following form:

$$g(\mu) = \sum_{i=1}^{N} a_i\, \delta(\mu - \mu_i), \qquad (17)$$

where $\delta(\cdot)$ is Dirac's generalized function and $\sum_{i=1}^{N} a_i = 1$.
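
As a short worked check, a discrete spectrum of rates with weights $a_i$ concentrated at the rates $\mu_i$ can be inserted into the continuous mixture; the sifting property of the delta function then gives the discrete density, and integrating the density from $\tau$ to infinity recovers the mixture survival function together with its normalization:

```latex
\begin{aligned}
\psi(\tau) &= \int_0^{\infty} \mu\, e^{-\mu\tau} \sum_{i=1}^{N} a_i\, \delta(\mu - \mu_i)\, d\mu
            = \sum_{i=1}^{N} a_i\, \mu_i\, e^{-\mu_i \tau}, \\
\Psi(\tau) &= \int_{\tau}^{\infty} \psi(\tau')\, d\tau'
            = \sum_{i=1}^{N} a_i\, e^{-\mu_i \tau},
\qquad \Psi(0) = \sum_{i=1}^{N} a_i = 1 .
\end{aligned}
```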

[Figure 2. Horizontal axis: $\tau$, ticks from 50 to 250.]

Figure 2. Waiting-time complementary cumulative distribution function $\Psi(\tau)$ for
simulated data (open circles) compared to a simple exponential fit (solid line) and to
a mixture of exponentials (crosses). See text for details.

In conclusion, we have shown that, in October 1999, waiting times between
consecutive trades in the 30 NYSE DJIA stocks were non-exponentially distributed.
We have summarized other recent results pointing to the same conclusions for different
markets. We have argued that this fact has implications for market microstructural
models, which should be able to reproduce such non-exponential behaviour to be realistic.
Finally, we have offered a possible explanation in terms of variable trading activity
during the day.